Results 1 - 20 of 30
1.
BMC Res Notes ; 16(1): 219, 2023 Sep 14.
Article in English | MEDLINE | ID: mdl-37710302

ABSTRACT

OBJECTIVES: This release note describes the Maize GxE project datasets within the Genomes to Fields (G2F) Initiative. The Maize GxE project aims to understand genotype by environment (GxE) interactions and use the information collected to improve resource allocation efficiency and increase genotype predictability and stability, particularly in scenarios of variable environmental patterns. Hybrids and inbreds are evaluated across multiple environments and phenotypic, genotypic, environmental, and metadata information are made publicly available. DATA DESCRIPTION: The datasets include phenotypic data of the hybrids and inbreds evaluated in 30 locations across the US and one location in Germany in 2020 and 2021, soil and climatic measurements and metadata information for all environments (combination of year and location), ReadMe, and description files for each data type. A set of common hybrids is present in each environment to connect with previous evaluations. Each environment had a collaborator responsible for collecting and submitting the data; the GxE coordination team combined all the collected information and removed obviously erroneous data. Collaborators received the combined data to use, verify and declare that the data generated in their own environments was accurate. Combined data is released to the public with minimal filtering to maintain fidelity to the original data.


Subjects
Resource Allocation , Zea mays , Zea mays/genetics , Seasons , Genotype , Germany
2.
BMC Res Notes ; 16(1): 148, 2023 Jul 17.
Article in English | MEDLINE | ID: mdl-37461058

ABSTRACT

OBJECTIVES: The Genomes to Fields (G2F) 2022 Maize Genotype by Environment (GxE) Prediction Competition aimed to develop models for predicting grain yield for the 2022 Maize GxE project field trials, leveraging the datasets previously generated by this project and other publicly available data. DATA DESCRIPTION: This resource used data from the Maize GxE project within the G2F Initiative [1]. The dataset included phenotypic and genotypic data of the hybrids evaluated in 45 locations from 2014 to 2022. It also included soil, weather, and environmental covariate data and metadata information for all environments (combination of year and location). Competitors also had access to ReadMe files which described all the files provided. The Maize GxE is a collaborative project and all the data generated becomes publicly available [2]. The dataset used in the 2022 Prediction Competition was curated and lightly filtered for quality and to ensure naming uniformity across years.


Subjects
Plant Genome , Zea mays , Phenotype , Zea mays/genetics , Genotype , Plant Genome/genetics , Edible Grain/genetics
3.
BMC Genom Data ; 24(1): 29, 2023 05 25.
Article in English | MEDLINE | ID: mdl-37231352

ABSTRACT

OBJECTIVES: This report provides information about the public release of the 2018-2019 Maize G X E project of the Genomes to Fields (G2F) Initiative datasets. G2F is an umbrella initiative that evaluates maize hybrids and inbred lines across multiple environments and makes available phenotypic, genotypic, environmental, and metadata information. The initiative understands the necessity to characterize and deploy public sources of genetic diversity to face the challenges for more sustainable agriculture in the context of variable environmental conditions. DATA DESCRIPTION: Datasets include phenotypic, climatic, and soil measurements, metadata information, and inbred genotypic information for each combination of location and year. Collaborators in the G2F initiative collected data for each location and year; members of the group responsible for coordination and data processing combined all the collected information and removed obvious erroneous data. The collaborators received the data before the DOI release to verify and declare that the data generated in their own locations was accurate. ReadMe and description files are available for each dataset. Previous years of evaluation are already publicly available, with common hybrids present to connect across all locations and years evaluated since this project's inception.


Subjects
Plant Genome , Zea mays , Phenotype , Zea mays/genetics , Seasons , Genotype , Plant Genome/genetics
4.
G3 (Bethesda) ; 13(4)2023 04 11.
Article in English | MEDLINE | ID: mdl-36625555

ABSTRACT

Accurate prediction of the phenotypic outcomes produced by different combinations of genotypes, environments, and management interventions remains a key goal in biology with direct applications to agriculture, research, and conservation. The past decades have seen an expansion of new methods applied toward this goal. Here we predict maize yield using deep neural networks, compare the efficacy of 2 model development methods, and contextualize model performance using conventional linear and machine learning models. We examine the usefulness of incorporating interactions between disparate data types. We find deep learning and best linear unbiased predictor (BLUP) models with interactions had the best overall performance. BLUP models achieved the lowest average error, but deep learning models performed more consistently with similar average error. Optimizing deep neural network submodules for each data type improved model performance relative to optimizing the whole model for all data types at once. Examining the effect of interactions in the best-performing model revealed that including interactions altered the model's sensitivity to weather and management features, including a reduction of the importance scores for timepoints expected to have a limited physiological basis for influencing yield: those at the extreme end of the season, nearly 200 days post planting. Based on these results, deep learning provides a promising avenue for the phenotypic prediction of complex traits in complex environments and a potential mechanism to better understand the influence of environmental and genetic factors.


Subjects
Deep Learning , Computer Neural Networks , Machine Learning , Genotype , Multifactorial Inheritance
5.
G3 (Bethesda) ; 13(2)2023 02 09.
Article in English | MEDLINE | ID: mdl-36454082

ABSTRACT

Identifying selection on polygenic complex traits in crops and livestock is important for understanding evolution and helps prioritize important characteristics for breeding. Quantitative trait loci (QTL) that contribute to polygenic trait variation often exhibit small or infinitesimal effects. This hinders the ability to detect QTL-controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The approach involves calculating a composite statistic across all markers that capture this relationship, followed by implementing a linkage disequilibrium-aware permutation test to evaluate if the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe "Ghat," an R package developed to implement this method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. Moreover, we applied Ghat to different simulated populations with different breeding histories and genetic architectures. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archival Network, and on GitHub.
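
The two steps this abstract describes (a composite statistic across markers, then a permutation test) can be illustrated with a minimal Python sketch. This is not the Ghat package's API. The function names are hypothetical, the product-sum form of the statistic is an assumption about how the relationship between allele-frequency change and effect size is summarized, and the permutation below is a naive shuffle that ignores linkage disequilibrium, whereas the abstract describes an LD-aware test.

```python
import random


def ghat_statistic(freq_change, effects):
    """Composite statistic (assumed form): sum over markers of
    allele-frequency change times estimated allelic effect."""
    return sum(d * e for d, e in zip(freq_change, effects))


def permutation_p_value(freq_change, effects, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the composite statistic.

    Simplification: effects are shuffled freely across markers, so
    linkage disequilibrium between neighboring markers is ignored.
    """
    rng = random.Random(seed)
    observed = ghat_statistic(freq_change, effects)
    shuffled = list(effects)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ghat_statistic(freq_change, shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

A large positive statistic would suggest that alleles with positive estimated effects tended to increase in frequency. A real analysis would need to account for LD when building the null distribution.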


Subjects
Multifactorial Inheritance , Plant Breeding , Multifactorial Inheritance/genetics , Quantitative Trait Loci , Linkage Disequilibrium , Gene Frequency , Phenotype
6.
Plant Genome ; 15(4): e20257, 2022 12.
Article in English | MEDLINE | ID: mdl-36258672

ABSTRACT

Low-density genotyping followed by imputation reduces genotyping costs while still providing high-density marker information. An increased marker density has the potential to improve the outcome of all applications that are based on genomic data. This study investigates techniques for 1k to 20k genomic marker imputation for plant breeding programs with sugar beet (Beta vulgaris L. ssp. vulgaris) as an example crop, where these are realistic marker numbers for modern breeding applications. The generally accepted 'gold standard' for imputation, Beagle 5.1, was compared with the recently developed software AlphaPlantImpute2 which is designed specifically for plant breeding. For Beagle 5.1 and AlphaPlantImpute2, the imputation strategy as well as the imputation parameters were optimized in this study. We found that the imputation accuracy of Beagle could be tremendously improved (0.22 to 0.67) by tuning parameters, mainly by lowering the values for the parameter for the effective population size and increasing the number of iterations performed. Separating the phasing and imputation steps also improved accuracies when optimized parameters were used (0.67 to 0.82). We also found that the imputation accuracy of Beagle decreased when more low-density lines were included for imputation. AlphaPlantImpute2 produced very high accuracies without optimization (0.89) and was generally less responsive to optimization. Overall, AlphaPlantImpute2 performed relatively better for imputation whereas Beagle was better for phasing. Combining both tools yielded the highest accuracies.


Subjects
Beta vulgaris , Dogs , Animals , Beta vulgaris/genetics , Genotype , Plant Breeding , Single Nucleotide Polymorphism , Sugars
7.
G3 (Bethesda) ; 12(11)2022 11 04.
Article in English | MEDLINE | ID: mdl-36124944

ABSTRACT

We introduce the R-package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or to retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated over specific periods of time based on naive (for instance, nonoverlapping 10-day windows) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient-boosted decision trees, random forests, stacked ensemble models, and multilayer perceptrons. These prediction models can be evaluated via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with multi-environment trial experimental data in a user-friendly way. The package is published under an MIT license and accessible on GitHub.
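
The "naive" aggregation mentioned in this abstract, nonoverlapping 10-day windows over daily weather data, can be sketched as follows. This is an illustrative Python example, not learnMET code (learnMET is an R package). The function name is hypothetical, and dropping a trailing partial window is an assumption made here for simplicity.

```python
def aggregate_windows(daily_values, window=10):
    """Return the mean of each full nonoverlapping window of daily values.

    Assumption: a trailing partial window (fewer than `window` days)
    is dropped rather than averaged.
    """
    means = []
    for start in range(0, len(daily_values) - window + 1, window):
        chunk = daily_values[start:start + window]
        means.append(sum(chunk) / window)
    return means
```

Each resulting window mean can then serve as one environmental covariate per trial. A phenological approach would instead set the window boundaries from crop growth stages.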


Subjects
Genomics , Machine Learning , Genomics/methods , Computer Neural Networks
8.
BMC Plant Biol ; 22(1): 87, 2022 Feb 26.
Article in English | MEDLINE | ID: mdl-35219296

ABSTRACT

BACKGROUND: Genomic selection is a powerful tool in plant breeding. By building a prediction model using a training set with markers and phenotypes, genomic estimated breeding values (GEBVs) can be used as predictions of breeding values in a target set with only genotype data. There is, however, limited information on how prediction accuracy of genomic prediction can be optimized. The objective of this study was to evaluate the performance of 11 genomic prediction models across species in terms of prediction accuracy for two traits with different heritabilities using several subsets of markers and training population proportions. Species studied were maize (Zea mays, L.), soybean (Glycine max, L.), and rice (Oryza sativa, L.), which vary in linkage disequilibrium (LD) decay rates and have contrasting genetic architectures. RESULTS: Correlations between observed and predicted GEBVs were determined via cross validation for three training-to-testing proportions (90:10, 70:30, and 50:50). Maize, which has the shortest extent of LD, showed the highest prediction accuracy. Amongst all the models tested, Bayes B performed better than or equal to all other models for each trait in all the three crops. Traits with higher broad-sense and narrow-sense heritabilities were associated with higher prediction accuracy. When subsets of markers were selected based on LD, the accuracy was similar to that observed from the complete set of markers. However, prediction accuracies were significantly improved when using a subset of total markers that were significant at P ≤ 0.05 or P ≤ 0.10. As expected, exclusion of QTL-associated markers in the model reduced prediction accuracy. Prediction accuracy varied among different training population proportions. CONCLUSIONS: We conclude that prediction accuracy for genomic selection can be improved by using the Bayes B model with a subset of significant markers and by selecting the training population based on narrow sense heritability.
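
The evaluation scheme in this abstract (splitting lines into training and testing sets at a fixed proportion, then correlating observed and predicted values) can be sketched in a few lines of Python. The function names and the random-shuffle split are assumptions for illustration. The abstract does not state how lines were assigned to sets, and no prediction model is included here.

```python
import math
import random


def split_indices(n, train_fraction, seed=0):
    """Randomly split indices 0..n-1 into training and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

With 100 lines, `split_indices(100, 0.9)`, `split_indices(100, 0.7)`, and `split_indices(100, 0.5)` reproduce the 90:10, 70:30, and 50:50 proportions. Prediction accuracy is then `pearson(observed, predicted)` computed on the testing set.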


Subjects
Glycine max/genetics , Genetic Models , Oryza/genetics , Zea mays/genetics , Genetic Markers , Plant Genome , Linkage Disequilibrium , Oryza/physiology , Phenotype , Plant Breeding , Single Nucleotide Polymorphism , Glycine max/physiology , Zea mays/physiology
9.
Front Plant Sci ; 12: 699589, 2021.
Article in English | MEDLINE | ID: mdl-34880880

ABSTRACT

The development of crop varieties with stable performance in future environmental conditions represents a critical challenge in the context of climate change. Environmental data collected at the field level, such as soil and climatic information, can be relevant to improve predictive ability in genomic prediction models by describing more precisely genotype-by-environment interactions, which represent a key component of the phenotypic response for complex crop agronomic traits. Modern predictive modeling approaches can efficiently handle various data types and are able to capture complex nonlinear relationships in large datasets. In particular, machine learning techniques have gained substantial interest in recent years. Here we examined the predictive ability of machine learning-based models for two phenotypic traits in maize using data collected by the Maize Genomes to Fields (G2F) Initiative. The data we analyzed consisted of multi-environment trials (METs) dispersed across the United States and Canada from 2014 to 2017. An assortment of soil- and weather-related variables was derived and used in prediction models alongside genotypic data. Linear random effects models were compared to a linear regularized regression method (elastic net) and to two nonlinear gradient boosting methods based on decision tree algorithms (XGBoost, LightGBM). These models were evaluated under four prediction problems: (1) tested and new genotypes in a new year; (2) only unobserved genotypes in a new year; (3) tested and new genotypes in a new site; (4) only unobserved genotypes in a new site. Accuracy in forecasting grain yield performance of new genotypes in a new year was improved by up to 20% over the baseline model by including environmental predictors with gradient boosting methods. For plant height, an enhancement of predictive ability could neither be observed by using machine learning-based methods nor by using detailed environmental information. 
An investigation of key environmental factors using gradient boosting frameworks also revealed that temperature at flowering stage, frequency and amount of water received during the vegetative and grain filling stage, and soil organic matter content appeared as important predictors for grain yield in our panel of environments.

10.
Mol Biol Evol ; 38(10): 4419-4434, 2021 09 27.
Article in English | MEDLINE | ID: mdl-34157722

ABSTRACT

Understanding the evolutionary history of crops, including identifying wild relatives, helps to provide insight for conservation and crop breeding efforts. Cultivated Brassica oleracea has intrigued researchers for centuries due to its wide diversity in forms, which include cabbage, broccoli, cauliflower, kale, kohlrabi, and Brussels sprouts. Yet, the evolutionary history of this species remains understudied. With such different vegetables produced from a single species, B. oleracea is a model organism for understanding the power of artificial selection. Persistent challenges in the study of B. oleracea include conflicting hypotheses regarding domestication and the identity of the closest living wild relative. Using newly generated RNA-seq data for a diversity panel of 224 accessions, which represents 14 different B. oleracea crop types and nine potential wild progenitor species, we integrate phylogenetic and population genetic techniques with ecological niche modeling, archaeological, and literary evidence to examine relationships among cultivars and wild relatives to clarify the origin of this horticulturally important species. Our analyses point to the Aegean endemic B. cretica as the closest living relative of cultivated B. oleracea, supporting an origin of cultivation in the Eastern Mediterranean region. Additionally, we identify several feral lineages, suggesting that cultivated plants of this species can revert to a wild-like state with relative ease. By expanding our understanding of the evolutionary history in B. oleracea, these results contribute to a growing body of knowledge on crop domestication that will facilitate continued breeding efforts including adaptation to changing environmental conditions.


Subjects
Brassica , Plant Breeding , Biological Evolution , Brassica/genetics , Agricultural Crops/genetics , Phylogeny
11.
Plant Cell Physiol ; 62(7): 1199-1214, 2021 Oct 29.
Article in English | MEDLINE | ID: mdl-34015110

ABSTRACT

The strength of the stalk rind, measured as rind penetrometer resistance (RPR), is an important contributor to stalk lodging resistance. To enhance the genetic architecture of RPR, we combined selection mapping on populations developed by 15 cycles of divergent selection for high and low RPR with time-course transcriptomic and metabolic analyses of the stalks. Divergent selection significantly altered allele frequencies of 3,656 and 3,412 single-nucleotide polymorphisms (SNPs) in the high and low RPR populations, respectively. Surprisingly, only 110 (1.56%) SNPs under selection were common in both populations, while the majority (98.4%) were unique to each population. This result indicated that high and low RPR phenotypes are produced by biologically distinct mechanisms. Remarkably, regions harboring lignin and polysaccharide genes were preferentially selected in high and low RPR populations, respectively. The preferential selection was manifested as higher lignification and increased saccharification of the high and low RPR stalks, respectively. The evolution of distinct gene classes according to the direction of selection was unexpected in the context of parallel evolution and demonstrated that selection for a trait, albeit in different directions, does not necessarily act on the same genes. Tricin, a grass-specific monolignol that initiates the incorporation of lignin in the cell walls, emerged as a key determinant of RPR. Integration of selection mapping and transcriptomic analyses with published genetic studies of RPR identified several candidate genes including ZmMYB31, ZmNAC25, ZmMADS1, ZmEXPA2, ZmIAA41 and hk5. These findings provide a foundation for an enhanced understanding of RPR and the improvement of stalk lodging resistance.


Subjects
Zea mays/genetics , Cell Wall/metabolism , Molecular Evolution , Gene Expression Profiling , Gene Frequency , Metabolomics , Single Nucleotide Polymorphism/genetics , Heritable Quantitative Trait , Zea mays/anatomy & histology
13.
G3 (Bethesda) ; 10(11): 4227-4239, 2020 11 05.
Article in English | MEDLINE | ID: mdl-32978264

ABSTRACT

Plant growth, development, and nutritional quality depend upon amino acid homeostasis, especially in seeds. However, our understanding of the underlying genetics influencing amino acid content and composition remains limited, with only a few candidate genes and quantitative trait loci identified to date. Improved knowledge of the genetics and biological processes that determine amino acid levels will enable researchers to use this information for plant breeding and biological discovery. Toward this goal, we used genomic prediction to identify biological processes that are associated with, and therefore potentially influence, free amino acid (FAA) composition in seeds of the model plant Arabidopsis thaliana. Markers were split into categories based on metabolic pathway annotations and fit using a genomic partitioning model to evaluate the influence of each pathway on heritability explained, model fit, and predictive ability. Selected pathways included processes known to influence FAA composition, albeit to an unknown degree, and spanned four categories: amino acid, core, specialized, and protein metabolism. Using this approach, we identified associations for pathways containing known variants for FAA traits, in addition to finding new trait-pathway associations. Markers related to amino acid metabolism, which are directly involved in FAA regulation, improved predictive ability for branched chain amino acids and histidine. The use of genomic partitioning also revealed patterns across biochemical families, in which serine-derived FAAs were associated with protein-related annotations and aromatic FAAs were associated with specialized metabolic pathways. Taken together, these findings provide evidence that genomic partitioning is a viable strategy to uncover the relative contributions of biological processes to FAA traits in seeds, offering a promising framework to guide hypothesis testing and narrow the search space for candidate genes.


Subjects
Arabidopsis , Biological Phenomena , Amino Acids , Arabidopsis/genetics , Genomics , Humans , Plant Breeding , Seeds/genetics
14.
Curr Opin Plant Biol ; 54: 93-100, 2020 04.
Article in English | MEDLINE | ID: mdl-32325397

ABSTRACT

Crop domestication is a fascinating area of study, as shown by a multitude of recent reviews. Coupled with the increasing availability of genomic and phenomic resources in numerous crop species, insights from evolutionary biology will enable a deeper understanding of the genetic architecture and short-term evolution of complex traits, which can be used to inform selection strategies. Future advances in crop improvement will rely on the integration of population genetics with plant breeding methodology, and the development of community resources to support research in a variety of crop life histories and reproductive strategies. We highlight recent advances related to the role of selective sweeps and demographic history in shaping genetic architecture, how these breakthroughs can inform selection strategies, and the application of precision gene editing to leverage these connections.


Subjects
Domestication , Plant Breeding , Breeding , Gene Editing , Plants/genetics
15.
BMC Plant Biol ; 19(1): 412, 2019 Oct 08.
Article in English | MEDLINE | ID: mdl-31590656

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) are a powerful tool for identifying quantitative trait loci (QTL) and causal single nucleotide polymorphisms (SNPs)/genes associated with various important traits in crop species. Typically, GWAS in crops are performed using a panel of inbred lines, where multiple replicates of the same inbred are measured and the average phenotype is taken as the response variable. Here we describe and evaluate single plant GWAS (sp-GWAS) for performing a GWAS on individual plants, which does not require an association panel of inbreds. Instead, sp-GWAS relies on the phenotypes and genotypes from individual plants sampled from a randomly mating population. Importantly, we demonstrate how sp-GWAS can be efficiently combined with a bulk segregant analysis (BSA) experiment to rapidly corroborate evidence for significant SNPs. RESULTS: In this study we used the Shoepeg maize landrace, collected as an open pollinating variety from a farm in Southern Missouri in the 1960s, to evaluate whether sp-GWAS coupled with BSA can be efficiently and powerfully used to detect significant association of SNPs for plant height (PH). Plants were grown in 8 locations across two years and in total 768 individuals were genotyped and phenotyped for sp-GWAS. A total of 306 k polymorphic markers in 768 individuals evaluated via association analysis detected 25 significant SNPs (P ≤ 0.00001) for PH. The results from our single-plant GWAS were further validated by bulk segregant analysis (BSA) for PH. BSA sequencing was performed on the same population by selecting tall and short plants as separate bulks. This approach identified 37 genomic regions for plant height. Of the 25 significant SNPs from GWAS, the three most significant SNPs co-localize with regions identified by BSA. CONCLUSION: Overall, this study demonstrates that sp-GWAS coupled with BSA can be a useful tool for detecting significant SNPs and identifying candidate genes. This result is particularly useful for species/populations where association panels are not readily available.


Subjects
Genome-Wide Association Study/methods , Single Nucleotide Polymorphism/genetics , Zea mays/genetics , Plant Chromosomes/genetics , Plant Genome/genetics , Linkage Disequilibrium/genetics , Quantitative Trait Loci/genetics
16.
Plant Cell ; 31(9): 1968-1989, 2019 09.
Article in English | MEDLINE | ID: mdl-31239390

ABSTRACT

Premature senescence in annual crops reduces yield, while delayed senescence, termed stay-green, imposes positive and negative impacts on yield and nutrition quality. Despite its importance, scant information is available on the genetic architecture of senescence in maize (Zea mays) and other cereals. We combined a systematic characterization of natural diversity for senescence in maize and coexpression networks derived from transcriptome analysis of normally senescing and stay-green lines. Sixty-four candidate genes were identified by genome-wide association study (GWAS), and 14 of these genes are supported by additional evidence for involvement in senescence-related processes including proteolysis, sugar transport and signaling, and sink activity. Eight of the GWAS candidates, independently supported by a coexpression network underlying stay-green, include a trehalose-6-phosphate synthase, a NAC transcription factor, and two xylan biosynthetic enzymes. Source-sink communication and the activity of cell walls as a secondary sink emerge as key determinants of stay-green. Mutant analysis supports the role of a candidate encoding Cys protease in stay-green in Arabidopsis (Arabidopsis thaliana), and analysis of natural alleles suggests a similar role in maize. This study provides a foundation for enhanced understanding and manipulation of senescence for increasing carbon yield, nutritional quality, and stress tolerance of maize and other cereals.


Subjects
Aging/genetics , Plant Gene Expression Regulation , Gene Regulatory Networks , Plant Genes/genetics , Zea mays/genetics , Arabidopsis/genetics , Gene Expression Profiling , Genome-Wide Association Study , Glucosyltransferases/genetics , Plant Leaves , Single Nucleotide Polymorphism , Transcription Factors/genetics , Transcriptome
17.
Front Plant Sci ; 10: 1794, 2019.
Article in English | MEDLINE | ID: mdl-32158452

ABSTRACT

Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are often controlled by incorporating covariates for structure and kinship in mixed linear models (MLM). These MLM-based methods are single locus models and can introduce false negatives due to over fitting of the model. In this study, eight different statistical models, ranging from single-locus to multilocus, were compared for AM for three traits differing in heritability in two crop species: soybean (Glycine max L.) and maize (Zea mays L.). Soybean and maize were chosen, in part, due to their highly differentiated rate of linkage disequilibrium (LD) decay, which can influence false positive and false negative rates. The fixed and random model circulating probability unification (FarmCPU) performed better than other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci (QTLs) in a simulated data set. These results indicate that the FarmCPU controls both false positives and false negatives. Six qualitative traits in soybean with known published genomic positions were also used to compare these models, and results indicated that the FarmCPU consistently identified a single highly significant SNP closest to these known published genes. Multiple comparison adjustments (Bonferroni, false discovery rate, and positive false discovery rate) were compared for these models using a simulated trait having 60% heritability and 20 QTLs. Multiple comparison adjustments were overly conservative for MLM, CMLM, ECMLM, and MLMM and did not find any significant markers; in contrast, ANOVA, GLM, and SUPER models found an excessive number of markers, far more than 20 QTLs. 
The FarmCPU model, using less conservative methods (false discovery rate, and positive false discovery rate) identified 10 QTLs, which was closer to the simulated number of QTLs than the number found by other models.
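
Two of the multiple comparison adjustments named in this abstract, Bonferroni and false discovery rate, can be sketched as follows. The false discovery rate version uses the Benjamini-Hochberg step-up procedure, which is an assumption, since the abstract does not name the specific FDR method. The positive false discovery rate is not covered, and the function names and the default `alpha` are illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag markers whose p-value passes the Bonferroni threshold alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag markers declared significant by the Benjamini-Hochberg procedure.

    Finds the largest rank k with p_(k) <= k * alpha / m, then flags
    every marker whose p-value rank is at most k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            max_rank = rank
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            significant[i] = True
    return significant
```

For the p-values `[0.001, 0.02, 0.03, 0.5]`, Bonferroni flags only the first marker while Benjamini-Hochberg flags the first three. This shows in miniature why the less conservative adjustment recovers more markers.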

18.
Genome Biol ; 18(1): 215, 2017 11 13.
Article in English | MEDLINE | ID: mdl-29132403

ABSTRACT

BACKGROUND: The history of maize has been characterized by major demographic events, including population size changes associated with domestication and range expansion, and gene flow with wild relatives. The interplay between demographic history and selection has shaped diversity across maize populations and genomes. RESULTS: We investigate these processes using high-depth resequencing data from 31 maize landraces spanning the pre-Columbian distribution of maize, and four wild teosinte individuals (Zea mays ssp. parviglumis). Genome-wide demographic analyses reveal that maize experienced pronounced declines in effective population size due to both a protracted domestication bottleneck and serial founder effects during post-domestication spread, while parviglumis in the Balsas River Valley experienced population growth. The domestication bottleneck and subsequent spread led to an increase in deleterious alleles in the domesticate compared to the wild progenitor. This cost is particularly pronounced in Andean maize, which has experienced a more dramatic founder event compared to other maize populations. Additionally, we detect introgression from the wild teosinte Zea mays ssp. mexicana into maize in the highlands of Mexico, Guatemala, and the southwestern USA, which reduces the prevalence of deleterious alleles likely due to the higher long-term effective population size of teosinte. CONCLUSIONS: These findings underscore the strong interaction between historical demography and the efficiency of selection and illustrate how domesticated species are particularly useful for understanding these processes. The landscape of deleterious alleles and therefore evolutionary potential is clearly influenced by recent demography, a factor that could bear importantly on many species that have experienced recent demographic shifts.


Subjects
Domestication , Genetic Selection , Zea mays/growth & development , Zea mays/genetics , Alleles , Inbreeding , Mutation/genetics , Population Density
19.
Plant Methods ; 13: 8, 2017.
Article in English | MEDLINE | ID: mdl-28250803

ABSTRACT

BACKGROUND: High-density marker panels and/or whole-genome sequencing, coupled with advanced phenotyping pipelines and sophisticated statistical methods, have dramatically increased our ability to generate lists of candidate genes or regions that are putatively associated with phenotypes or processes of interest. However, the speed with which we can validate genes, or even make reasonable biological interpretations about the principles underlying them, has not kept pace. A promising approach that runs parallel to explicitly validating individual genes is analyzing a set of genes together and assessing the biological similarities among them. This is often achieved via gene ontology analysis, a powerful tool that involves evaluating publicly available gene annotations. However, additional resources such as Medical Subject Headings (MeSH) can also be used to evaluate sets of genes to make biological interpretations. RESULTS: In this manuscript, we describe utilizing MeSH terms to make biological interpretations in maize. MeSH terms are assigned to PubMed-indexed manuscripts by the National Library of Medicine, and can be directly mapped to genes to develop gene annotations. Once mapped, these terms can be evaluated for enrichment in sets of genes or similarity between gene sets to provide biological insights. Here, we implement MeSH analyses in five maize datasets to demonstrate how MeSH can be leveraged by the maize and broader crop-genomics community. CONCLUSIONS: We demonstrate that MeSH terms can be effectively leveraged to generate hypotheses and make biological interpretations in maize, and we provide a pipeline that enables the use of MeSH terms in other plant species.
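
The enrichment step described above can be sketched as a one-sided over-representation test. The abstract does not specify the statistical test, so the hypergeometric test used here is an assumption (it is a common choice for term enrichment), and the function name, the MeSH term, and all counts are hypothetical. This is a minimal illustration, not the authors' pipeline.

```python
from math import comb


def enrichment_p_value(n_background, n_annotated, n_selected, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_background: genes in the background set
    n_annotated:  background genes annotated with the term
    n_selected:   genes in the candidate list
    n_overlap:    candidate genes annotated with the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_annotated, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts for a single MeSH term, for illustration only.
p = enrichment_p_value(n_background=20, n_annotated=5, n_selected=4, n_overlap=4)
print(f"Enrichment p-value for the hypothetical term: {p:.6f}")
```

In practice one p-value is computed per term and the resulting set is corrected for multiple testing before terms are reported as enriched.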

20.
G3 (Bethesda) ; 6(8): 2447-53, 2016 Aug 09.
Article in English | MEDLINE | ID: mdl-27261003

ABSTRACT

Biomedical vocabularies and ontologies aid in recapitulating biological knowledge. The annotation of gene products is mainly accelerated by Gene Ontology (GO), and more recently by Medical Subject Headings (MeSH). Here, we report a suite of MeSH packages for chicken in Bioconductor, and illustrate some features of different MeSH-based analyses, including MeSH-informed enrichment analysis and MeSH-guided semantic similarity among terms and gene products, using two lists of chicken genes available in public repositories. The two published datasets that were employed represent (i) differentially expressed genes, and (ii) candidate genes under selective sweep or epistatic selection. The comparison of MeSH with GO overrepresentation analyses suggested not only that MeSH supports the findings obtained from GO analysis, but also that MeSH is able to further enrich the representation of biological knowledge and often provide more interpretable results. Based on the hierarchical structures of MeSH and GO, we computed semantic similarities among vocabularies, as well as semantic similarities among selected genes. These yielded the similarity levels between significant functional terms, and the annotation of each gene yielded the measures of gene similarity. Our findings show the benefits of using MeSH as an alternative choice of annotation in order to draw biological inferences from a list of genes of interest. We argue that the use of MeSH in conjunction with GO will be instrumental in facilitating the understanding of the genetic basis of complex traits.


Subjects
Chickens/genetics , Gene Ontology , Medical Subject Headings , Semantics , Animals , Chickens/classification , Genetic Databases , Humans , Terminology as Topic , Controlled Vocabulary